SARS Exploration Server

Warning: this site is under development!
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

Simple Semantics in Topic Detection and Tracking

Internal identifier: 005678 (Main/Exploration); previous: 005677; next: 005679

Authors: Juha Makkonen [Finland]; Helena Ahonen-Myka [Finland]; Marko Salmenkivi [Finland]

Source: Information Retrieval [1386-4564]; 2004

RBID : ISTEX:B2363F20ECE633DCC10A09AD404376A99955A0AC

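English descriptors: geographical ontology; information extraction; retrieval model; temporal expression; topic detection and tracking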

Abstract

Abstract: Topic Detection and Tracking (TDT) is a research initiative that aims at techniques to organize news documents in terms of news events. We propose a method that incorporates simple semantics into TDT by splitting the term space into groups of terms that have the meaning of the same type. Such a group can be associated with an external ontology. This ontology is used to determine the similarity of two terms in the given group. We extract proper names, locations, temporal expressions and normal terms into distinct sub-vectors of the document representation. Measuring the similarity of two documents is conducted by comparing a pair of their corresponding sub-vectors at a time. We use a simple perceptron to optimize the relative emphasis of each semantic class in the tracking and detection decisions. The results suggest that the spatial and the temporal similarity measures need to be improved. Especially the vagueness of spatial and temporal terms needs to be addressed.
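
The document comparison outlined in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the class names, the function names, and the training data are hypothetical, and plain cosine similarity stands in for the ontology-based term similarity that the abstract mentions. Each document is assumed to be a mapping from semantic class to a term-frequency sub-vector; sub-vectors are compared class by class, and a simple perceptron learns how much weight each class receives in the same-topic decision.

```python
import math
from collections import Counter

# Hypothetical class labels for the four groups named in the abstract.
CLASSES = ("names", "locations", "temporals", "terms")


def cosine(a, b):
    """Cosine similarity of two sparse term-frequency vectors.

    Assumption: this replaces the ontology-based similarity of the paper.
    """
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def class_similarities(doc_a, doc_b):
    """Compare corresponding sub-vectors, one semantic class at a time."""
    return [cosine(doc_a.get(c, Counter()), doc_b.get(c, Counter()))
            for c in CLASSES]


def train_perceptron(samples, labels, epochs=20, lr=0.1):
    """Learn one weight per semantic class with the perceptron rule."""
    weights, bias = [0.0] * len(CLASSES), 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            score = sum(w * xi for w, xi in zip(weights, x)) + bias
            error = y - (1 if score > 0 else 0)
            if error:
                weights = [w + lr * error * xi for w, xi in zip(weights, x)]
                bias += lr * error
    return weights, bias


def same_topic(doc_a, doc_b, weights, bias):
    """Decide whether two documents discuss the same event."""
    x = class_similarities(doc_a, doc_b)
    return sum(w * xi for w, xi in zip(weights, x)) + bias > 0
```

In use, each training sample would be the similarity vector of a labeled document pair (1 for same topic, 0 otherwise), and the learned weights indicate the relative emphasis placed on each semantic class.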

URL: https://api.istex.fr/ark:/67375/VQC-H8H7K8VH-K/fulltext.pdf
DOI: 10.1023/B:INRT.0000011210.12953.86



The document in XML format

<record>
<TEI wicri:istexFullTextTei="biblStruct">
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">Simple Semantics in Topic Detection and Tracking</title>
<author>
<name sortKey="Makkonen, Juha" sort="Makkonen, Juha" uniqKey="Makkonen J" first="Juha" last="Makkonen">Juha Makkonen</name>
</author>
<author>
<name sortKey="Ahonen Myka, Helena" sort="Ahonen Myka, Helena" uniqKey="Ahonen Myka H" first="Helena" last="Ahonen-Myka">Helena Ahonen-Myka</name>
</author>
<author>
<name sortKey="Salmenkivi, Marko" sort="Salmenkivi, Marko" uniqKey="Salmenkivi M" first="Marko" last="Salmenkivi">Marko Salmenkivi</name>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">ISTEX</idno>
<idno type="RBID">ISTEX:B2363F20ECE633DCC10A09AD404376A99955A0AC</idno>
<date when="2004" year="2004">2004</date>
<idno type="doi">10.1023/B:INRT.0000011210.12953.86</idno>
<idno type="url">https://api.istex.fr/ark:/67375/VQC-H8H7K8VH-K/fulltext.pdf</idno>
<idno type="wicri:Area/Istex/Corpus">001E57</idno>
<idno type="wicri:explorRef" wicri:stream="Istex" wicri:step="Corpus" wicri:corpus="ISTEX">001E57</idno>
<idno type="wicri:Area/Istex/Curation">001E57</idno>
<idno type="wicri:Area/Istex/Checkpoint">001D56</idno>
<idno type="wicri:explorRef" wicri:stream="Istex" wicri:step="Checkpoint">001D56</idno>
<idno type="wicri:doubleKey">1386-4564:2004:Makkonen J:simple:semantics:in</idno>
<idno type="wicri:Area/Main/Merge">005B24</idno>
<idno type="wicri:Area/Main/Curation">005678</idno>
<idno type="wicri:Area/Main/Exploration">005678</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title level="a" type="main" xml:lang="en">Simple Semantics in Topic Detection and Tracking</title>
<author>
<name sortKey="Makkonen, Juha" sort="Makkonen, Juha" uniqKey="Makkonen J" first="Juha" last="Makkonen">Juha Makkonen</name>
<affiliation wicri:level="4">
<country xml:lang="fr">Finlande</country>
<wicri:regionArea>Department of Computer Science, University of Helsinki, P.O. Box 26 (Teollisuuskatu 23), 00014, FIN-</wicri:regionArea>
<orgName type="university">Université d'Helsinki</orgName>
<placeName>
<settlement type="city">Helsinki</settlement>
<region type="région" nuts="2">Uusimaa</region>
</placeName>
</affiliation>
<affiliation wicri:level="1">
<country wicri:rule="url">Finlande</country>
</affiliation>
</author>
<author>
<name sortKey="Ahonen Myka, Helena" sort="Ahonen Myka, Helena" uniqKey="Ahonen Myka H" first="Helena" last="Ahonen-Myka">Helena Ahonen-Myka</name>
<affiliation wicri:level="4">
<country xml:lang="fr">Finlande</country>
<wicri:regionArea>Department of Computer Science, University of Helsinki, P.O. Box 26 (Teollisuuskatu 23), 00014, FIN-</wicri:regionArea>
<orgName type="university">Université d'Helsinki</orgName>
<placeName>
<settlement type="city">Helsinki</settlement>
<region type="région" nuts="2">Uusimaa</region>
</placeName>
</affiliation>
<affiliation wicri:level="1">
<country wicri:rule="url">Finlande</country>
</affiliation>
</author>
<author>
<name sortKey="Salmenkivi, Marko" sort="Salmenkivi, Marko" uniqKey="Salmenkivi M" first="Marko" last="Salmenkivi">Marko Salmenkivi</name>
<affiliation wicri:level="4">
<country xml:lang="fr">Finlande</country>
<wicri:regionArea>Department of Computer Science, University of Helsinki, P.O. Box 26 (Teollisuuskatu 23), 00014, FIN-</wicri:regionArea>
<orgName type="university">Université d'Helsinki</orgName>
<placeName>
<settlement type="city">Helsinki</settlement>
<region type="région" nuts="2">Uusimaa</region>
</placeName>
</affiliation>
<affiliation wicri:level="1">
<country wicri:rule="url">Finlande</country>
</affiliation>
</author>
</analytic>
<monogr></monogr>
<series>
<title level="j">Information Retrieval</title>
<title level="j" type="abbrev">Information Retrieval</title>
<idno type="ISSN">1386-4564</idno>
<idno type="eISSN">1573-7659</idno>
<imprint>
<publisher>Kluwer Academic Publishers</publisher>
<pubPlace>Boston</pubPlace>
<date type="published" when="2004-09-01">2004-09-01</date>
<biblScope unit="volume">7</biblScope>
<biblScope unit="issue">3-4</biblScope>
<biblScope unit="page" from="347">347</biblScope>
<biblScope unit="page" to="368">368</biblScope>
</imprint>
<idno type="ISSN">1386-4564</idno>
</series>
</biblStruct>
</sourceDesc>
<seriesStmt>
<idno type="ISSN">1386-4564</idno>
</seriesStmt>
</fileDesc>
<profileDesc>
<textClass>
<keywords scheme="KwdEn" xml:lang="en">
<term>geographical ontology</term>
<term>information extraction</term>
<term>retrieval model</term>
<term>temporal expression</term>
<term>topic detection and tracking</term>
</keywords>
</textClass>
<langUsage>
<language ident="en">en</language>
</langUsage>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">Abstract: Topic Detection and Tracking (TDT) is a research initiative that aims at techniques to organize news documents in terms of news events. We propose a method that incorporates simple semantics into TDT by splitting the term space into groups of terms that have the meaning of the same type. Such a group can be associated with an external ontology. This ontology is used to determine the similarity of two terms in the given group. We extract proper names, locations, temporal expressions and normal terms into distinct sub-vectors of the document representation. Measuring the similarity of two documents is conducted by comparing a pair of their corresponding sub-vectors at a time. We use a simple perceptron to optimize the relative emphasis of each semantic class in the tracking and detection decisions. The results suggest that the spatial and the temporal similarity measures need to be improved. Especially the vagueness of spatial and temporal terms needs to be addressed.</div>
</front>
</TEI>
<affiliations>
<list>
<country>
<li>Finlande</li>
</country>
<region>
<li>Uusimaa</li>
</region>
<settlement>
<li>Helsinki</li>
</settlement>
<orgName>
<li>Université d'Helsinki</li>
</orgName>
</list>
<tree>
<country name="Finlande">
<region name="Uusimaa">
<name sortKey="Makkonen, Juha" sort="Makkonen, Juha" uniqKey="Makkonen J" first="Juha" last="Makkonen">Juha Makkonen</name>
</region>
<name sortKey="Ahonen Myka, Helena" sort="Ahonen Myka, Helena" uniqKey="Ahonen Myka H" first="Helena" last="Ahonen-Myka">Helena Ahonen-Myka</name>
<name sortKey="Ahonen Myka, Helena" sort="Ahonen Myka, Helena" uniqKey="Ahonen Myka H" first="Helena" last="Ahonen-Myka">Helena Ahonen-Myka</name>
<name sortKey="Makkonen, Juha" sort="Makkonen, Juha" uniqKey="Makkonen J" first="Juha" last="Makkonen">Juha Makkonen</name>
<name sortKey="Salmenkivi, Marko" sort="Salmenkivi, Marko" uniqKey="Salmenkivi M" first="Marko" last="Salmenkivi">Marko Salmenkivi</name>
<name sortKey="Salmenkivi, Marko" sort="Salmenkivi, Marko" uniqKey="Salmenkivi M" first="Marko" last="Salmenkivi">Marko Salmenkivi</name>
</country>
</tree>
</affiliations>
</record>

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Sante/explor/SrasV1/Data/Main/Exploration
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 005678 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 005678 | SxmlIndent | more

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Sante
   |area=    SrasV1
   |flux=    Main
   |étape=   Exploration
   |type=    RBID
   |clé=     ISTEX:B2363F20ECE633DCC10A09AD404376A99955A0AC
   |texte=   Simple Semantics in Topic Detection and Tracking
}}

Wicri

This area was generated with Dilib version V0.6.33.
Data generation: Tue Apr 28 14:49:16 2020. Site generation: Sat Mar 27 22:06:49 2021